@AnthonyRonning
Contributor

@AnthonyRonning AnthonyRonning commented Dec 31, 2025

Summary

Adds support for text embeddings via the OpenAI-compatible /v1/embeddings endpoint using Tinfoil's nomic-embed-text model (768 dimensions).

Changes

tinfoil-proxy (Go)

  • Enabled nomic-embed-text model in model configs
  • Added EmbeddingRequest, EmbeddingResponse, EmbeddingData, EmbeddingUsage types
  • Added handleEmbeddings handler with proper input handling (single string or array)
  • Registered /v1/embeddings POST route

opensecret (Rust)

  • Added EmbeddingRequest struct in src/web/openai.rs
  • Added proxy_embeddings handler with:
    • Guest user billing checks
    • Input validation
    • Encryption middleware integration
    • Billing/usage tracking (prompt_tokens only, completion_tokens=0)
  • Added nomic-embed-text to proxy router config and models list in src/proxy_config.rs
  • Registered /v1/embeddings route

API Format

Follows standard OpenAI embeddings API:

// Request
POST /v1/embeddings
{
  "input": "text to embed" | ["text1", "text2"],
  "model": "nomic-embed-text"
}

// Response
{
  "object": "list",
  "data": [{"object": "embedding", "index": 0, "embedding": [...768 floats...]}],
  "model": "nomic-embed-text",
  "usage": {"prompt_tokens": N, "total_tokens": N}
}
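
The request and response shapes above can be modeled in Go roughly as follows. This is a minimal sketch, not the code from tinfoil-proxy/main.go: the type names come from the PR description, but the field layout, JSON tags, and the helpers `buildRequest` and `parseResponse` are assumptions made for illustration.

```go
package main

import "encoding/json"

// EmbeddingRequest mirrors the request body shown above. Input is an
// interface{} because the API accepts one string or an array of strings.
// The field layout is assumed from the JSON example, not copied from the PR.
type EmbeddingRequest struct {
	Input interface{} `json:"input"`
	Model string      `json:"model"`
}

// EmbeddingData is one entry of the "data" array in the response.
type EmbeddingData struct {
	Object    string    `json:"object"`
	Index     int       `json:"index"`
	Embedding []float64 `json:"embedding"`
}

// EmbeddingUsage carries prompt and total token counts only.
type EmbeddingUsage struct {
	PromptTokens int `json:"prompt_tokens"`
	TotalTokens  int `json:"total_tokens"`
}

// EmbeddingResponse is the top-level response object.
type EmbeddingResponse struct {
	Object string          `json:"object"`
	Data   []EmbeddingData `json:"data"`
	Model  string          `json:"model"`
	Usage  EmbeddingUsage  `json:"usage"`
}

// buildRequest (hypothetical helper) serializes a body for POST /v1/embeddings.
func buildRequest(input interface{}, model string) ([]byte, error) {
	return json.Marshal(EmbeddingRequest{Input: input, Model: model})
}

// parseResponse (hypothetical helper) decodes a response body into typed values.
func parseResponse(body []byte) (EmbeddingResponse, error) {
	var resp EmbeddingResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}
```

Passing a plain string or a `[]string` to `buildRequest` produces the two accepted forms of the `input` field.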

Testing

  • Tested with Rust SDK integration tests (single input, multiple inputs, string conversion)
  • All tests pass with correct 768-dimension embeddings
  • Billing events properly emitted to SQS
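
A dimension check like the one described in the test notes could look like the sketch below. The function `checkDimensions` and the constant `expectedDims` are hypothetical names; the actual integration tests live in the Rust SDK and are not shown in this PR.

```go
package main

import "fmt"

// expectedDims is the vector length stated for nomic-embed-text in this PR.
const expectedDims = 768

// checkDimensions (hypothetical helper) returns an error naming the first
// embedding whose length differs from want, or nil if all lengths match.
func checkDimensions(embeddings [][]float64, want int) error {
	for i, e := range embeddings {
		if len(e) != want {
			return fmt.Errorf("embedding %d has %d dimensions, want %d", i, len(e), want)
		}
	}
	return nil
}
```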

Summary by CodeRabbit

New Features

  • Added embeddings API endpoint (/v1/embeddings) with model selection, input validation, and encoding format configuration.
  • Made "nomic-embed-text" model available.
  • Implemented usage tracking and billing for embeddings requests.


AnthonyRonning and others added 2 commits December 31, 2025 13:18
- Add /v1/embeddings endpoint to tinfoil-proxy (Go) with handleEmbeddings handler
- Enable nomic-embed-text model in tinfoil-proxy model configs
- Add proxy_embeddings handler in Rust with encryption middleware
- Add nomic-embed-text to proxy router config and models list
- Include billing/usage tracking for embeddings (prompt_tokens only)
- Follows OpenAI embeddings API format (768 dimensions)

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
@coderabbitai

coderabbitai bot commented Dec 31, 2025

Walkthrough

This PR introduces embeddings API support across the proxy stack. It adds a new "nomic-embed-text" model to the configuration, implements an embeddings endpoint in the Rust service with request validation, billing integration, and response encryption, and extends the Go proxy with embeddings handler and types.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| PCR History Data: pcrDevHistory.json, pcrProdHistory.json | Each file receives a single new PCR history entry with values for PCR0, PCR1, PCR2, timestamp, and signature; no modifications to existing records. |
| Model Configuration: src/proxy_config.rs | Registers "nomic-embed-text" model in Tinfoil-only routes and static models list; exposes model through public API. |
| Rust Embeddings Endpoint: src/web/openai.rs | Adds new embeddings endpoint (/v1/embeddings) with EmbeddingRequest struct, proxy_embeddings handler, input validation, provider routing, usage/billing event publishing (prompt_tokens), response encryption, and error handling for timeouts and non-success responses. Note: duplicated additions detected in the file. |
| Go Proxy Embeddings Handler: tinfoil-proxy/main.go | Introduces EmbeddingRequest, EmbeddingData, EmbeddingUsage, EmbeddingResponse types and implements handleEmbeddings method; wires POST /v1/embeddings route with client reinitialization on certificate errors. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant RustService as Rust Service
    participant ProxyRouter as Proxy Router
    participant GoProxy as Go Proxy<br/>(Tinfoil)
    participant Provider as External<br/>Provider

    Client->>RustService: POST /v1/embeddings<br/>(EmbeddingRequest)
    Note over RustService: Decrypt request<br/>Validate input (non-empty)
    alt Input invalid
        RustService-->>Client: Error response
    else Proceed
        RustService->>RustService: Check billing<br/>(guest user gate)
        alt Guest billing blocked
            RustService-->>Client: Billing error
        else Proceed
            RustService->>ProxyRouter: Resolve model route<br/>(nomic-embed-text)
            ProxyRouter-->>RustService: Go Proxy endpoint
            RustService->>GoProxy: Forward embeddings request
            GoProxy->>Provider: POST /v1/embeddings
            alt Timeout or non-200 response
                Provider-->>GoProxy: Error
                GoProxy-->>RustService: Error response
                RustService-->>Client: Error (encrypted)
            else Success
                Provider-->>GoProxy: EmbeddingResponse
                rect rgb(220, 240, 230)
                Note over GoProxy: Parse embeddings<br/>Convert to Go types
                end
                GoProxy-->>RustService: Parsed response
                rect rgb(240, 230, 220)
                Note over RustService: Publish usage event<br/>(prompt_tokens)<br/>Encrypt response
                end
                RustService-->>Client: Encrypted response
            end
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Embeddings bloom in the meadow of code,
A rabbit hops through each node and load,
From Rust to Go, the data does flow,
Model names whisper what proxies will know,
Billing bells ring as vectors take flight! 🌟

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding an embeddings API endpoint with the nomic-embed-text model. It directly reflects the primary objective and is specific enough to convey the core contribution. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@greptile-apps

greptile-apps bot commented Dec 31, 2025

Greptile Summary

Adds text embeddings support via a new /v1/embeddings endpoint using Tinfoil's nomic-embed-text model (768 dimensions). The implementation follows OpenAI's embeddings API format and integrates with the existing proxy architecture.

  • Added EmbeddingRequest struct and proxy_embeddings handler in Rust with guest user billing checks, input validation, and usage tracking via SQS
  • Added handleEmbeddings handler in Go tinfoil-proxy with input type conversion (string or array of strings) and proper error handling
  • Registered nomic-embed-text model in proxy router config and enabled it in tinfoil model configs
  • Updated PCR values for dev and prod environments to reflect the new binary

Confidence Score: 5/5

  • This PR is safe to merge - it adds a new endpoint following established patterns with proper validation and billing.
  • The implementation closely follows existing patterns for TTS and transcription endpoints. Input validation, guest user billing checks, and usage tracking are all properly implemented. No breaking changes to existing functionality.
  • No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/web/openai.rs | Adds proxy_embeddings handler with guest user billing checks, input validation, encryption middleware, and usage tracking. Implementation follows established patterns from other endpoints. |
| tinfoil-proxy/main.go | Adds handleEmbeddings handler and EmbeddingRequest/EmbeddingResponse types. Enables nomic-embed-text model and registers /v1/embeddings route. |
| src/proxy_config.rs | Adds nomic-embed-text model to the tinfoil route and available models list. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant Rust API (opensecret)
    participant Go Proxy (tinfoil-proxy)
    participant Tinfoil Backend
    participant SQS (Billing)

    Client->>Rust API (opensecret): POST /v1/embeddings (encrypted)
    Rust API (opensecret)->>Rust API (opensecret): Decrypt request
    Rust API (opensecret)->>Rust API (opensecret): Check guest user billing
    Rust API (opensecret)->>Rust API (opensecret): Validate input (non-empty)
    Rust API (opensecret)->>Rust API (opensecret): Get model route config
    Rust API (opensecret)->>Go Proxy (tinfoil-proxy): POST /v1/embeddings
    Go Proxy (tinfoil-proxy)->>Go Proxy (tinfoil-proxy): Parse & validate input
    Go Proxy (tinfoil-proxy)->>Tinfoil Backend: Embeddings.New()
    Tinfoil Backend-->>Go Proxy (tinfoil-proxy): EmbeddingResponse
    Go Proxy (tinfoil-proxy)-->>Rust API (opensecret): JSON response with usage
    Rust API (opensecret)->>SQS (Billing): Publish usage event (async)
    Rust API (opensecret)->>Rust API (opensecret): Encrypt response
    Rust API (opensecret)-->>Client: Encrypted embeddings (768 dims)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
tinfoil-proxy/main.go (1)

1106-1120: Silent dropping of non-string array items may cause unexpected behavior.

When the input is an array, non-string items are silently filtered out. If a client accidentally sends ["hello", 123, "world"], only ["hello", "world"] would be processed without any indication. Consider logging a warning or returning an error for invalid array items.

🔎 Proposed enhancement
 	case []interface{}:
 		for _, item := range v {
 			if str, ok := item.(string); ok {
 				inputs = append(inputs, str)
+			} else {
+				log.Printf("Warning: non-string item in input array ignored")
 			}
 		}
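
If rejecting the request is preferred over logging, the conversion could return an error for the first non-string item. The sketch below is one possible shape under that assumption; `normalizeInput` is a hypothetical standalone function, not the body of handleEmbeddings.

```go
package main

import "fmt"

// normalizeInput (hypothetical) converts the decoded JSON "input" field into
// a slice of strings. Unlike the current behavior described above, it rejects
// non-string array items instead of dropping them.
func normalizeInput(input interface{}) ([]string, error) {
	switch v := input.(type) {
	case string:
		return []string{v}, nil
	case []interface{}:
		inputs := make([]string, 0, len(v))
		for i, item := range v {
			str, ok := item.(string)
			if !ok {
				return nil, fmt.Errorf("input[%d]: expected string, got %T", i, item)
			}
			inputs = append(inputs, str)
		}
		return inputs, nil
	default:
		return nil, fmt.Errorf("input: expected string or array of strings, got %T", input)
	}
}
```

With this variant, `["hello", 123, "world"]` fails with an error that names index 1, so the client learns about the malformed item instead of receiving two embeddings for three inputs.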
src/web/openai.rs (1)

1566-1567: Parameter naming inconsistency: _auth_method is actually used.

The parameter _auth_method has an underscore prefix which conventionally indicates an unused parameter, but it's used on line 1704 for billing context. Consider removing the underscore prefix for clarity.

🔎 Proposed fix
 async fn proxy_embeddings(
     State(state): State<Arc<AppState>>,
     _headers: HeaderMap,
     axum::Extension(session_id): axum::Extension<Uuid>,
     axum::Extension(user): axum::Extension<User>,
-    axum::Extension(_auth_method): axum::Extension<AuthMethod>,
+    axum::Extension(auth_method): axum::Extension<AuthMethod>,
     axum::Extension(embedding_request): axum::Extension<EmbeddingRequest>,
 ) -> Result<Json<EncryptedResponse<Value>>, ApiError> {

And on line 1704:

             let billing_context =
-                BillingContext::new(_auth_method, embedding_request.model.clone());
+                BillingContext::new(auth_method, embedding_request.model.clone());
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 43e64e0 and fd400f9.

⛔ Files ignored due to path filters (3)
  • pcrDev.json is excluded by !pcrDev.json
  • pcrProd.json is excluded by !pcrProd.json
  • tinfoil-proxy/dist/tinfoil-proxy is excluded by !**/dist/**
📒 Files selected for processing (5)
  • pcrDevHistory.json
  • pcrProdHistory.json
  • src/proxy_config.rs
  • src/web/openai.rs
  • tinfoil-proxy/main.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-26T16:05:09.950Z
Learnt from: AnthonyRonning
Repo: OpenSecretCloud/opensecret PR: 95
File: tinfoil-proxy/main.go:55-59
Timestamp: 2025-08-26T16:05:09.950Z
Learning: The qwen3-coder-480b model is available in Tinfoil's model catalog and can be used in the tinfoil-proxy service.

Applied to files:

  • src/proxy_config.rs
🧬 Code graph analysis (1)
src/web/openai.rs (1)
tinfoil-proxy/main.go (1)
  • EmbeddingRequest (171-177)
⏰ Context from checks skipped due to timeout of 100000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Greptile Review
  • GitHub Check: Development Reproducible Build
🔇 Additional comments (12)
pcrProdHistory.json (1)

470-477: LGTM!

The new PCR history entry follows the established format with valid PCR0, PCR1, PCR2 values, timestamp, and signature. The structure is consistent with existing entries.

pcrDevHistory.json (1)

470-477: LGTM!

The dev environment PCR history entry is properly formatted and follows the established pattern.

tinfoil-proxy/main.go (3)

43-47: LGTM!

The nomic-embed-text model is correctly added to the model configurations with an appropriate description and active status.


171-195: LGTM!

The embedding types are well-structured and align with the OpenAI embeddings API specification. Using interface{} for Input correctly handles both string and array inputs, and the EmbeddingUsage struct appropriately omits CompletionTokens since embeddings only consume prompt tokens.


1256-1259: LGTM!

The embeddings route is correctly registered following the established pattern for other endpoints.

src/proxy_config.rs (2)

143-143: LGTM!

The nomic-embed-text model is correctly added to the Tinfoil-only routes without fallback, consistent with how other Tinfoil-exclusive models are configured.


185-185: LGTM!

The model is properly included in the user-facing models list when Tinfoil is configured.

src/web/openai.rs (5)

115-131: LGTM!

The EmbeddingRequest struct is well-designed with appropriate serde attributes. Using serde_json::Value for input allows flexible handling of both string and array inputs, and the default model is correctly set to "nomic-embed-text".


221-227: LGTM!

The embeddings route is correctly registered with the decrypt_request middleware, following the established pattern for other endpoints in this router.


1571-1608: LGTM!

The guest user billing checks and input validation are thorough and consistent with other endpoints. The validation correctly handles empty strings, empty arrays, and invalid input types.
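
For readers more familiar with the Go side, the three validation cases named here (empty string, empty array, invalid type) can be sketched as below. This is an illustration in Go of the rule as described, not a translation of the Rust handler; `validateInput` and its error messages are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
)

// validateInput (hypothetical) decodes the raw JSON value of the "input"
// field and rejects empty strings, empty arrays, and any other JSON type.
func validateInput(raw []byte) error {
	var v interface{}
	if err := json.Unmarshal(raw, &v); err != nil {
		return err
	}
	switch in := v.(type) {
	case string:
		if in == "" {
			return errors.New("input string is empty")
		}
	case []interface{}:
		if len(in) == 0 {
			return errors.New("input array is empty")
		}
	default:
		// Numbers, booleans, objects, and null all land here.
		return errors.New("input must be a string or an array")
	}
	return nil
}
```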


1695-1718: LGTM!

The billing handling for embeddings correctly accounts for prompt tokens only (with completion_tokens: 0), which is appropriate since embeddings don't generate completion tokens. The usage event is properly published when prompt_tokens > 0.
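
The billing rule described here (prompt tokens only, completion tokens fixed at zero, publish only when prompt_tokens > 0) is small enough to sketch. The `UsageEvent` type and `usageEventFor` function below are hypothetical and written in Go for illustration; the real handler is in Rust and publishes to SQS.

```go
package main

// UsageEvent is a hypothetical billing record; the real event schema is not
// shown in this PR.
type UsageEvent struct {
	Model            string
	PromptTokens     int
	CompletionTokens int
}

// usageEventFor (hypothetical) builds the event for an embeddings call.
// The boolean is false when there is nothing to bill, in which case the
// caller should skip publishing.
func usageEventFor(model string, promptTokens int) (UsageEvent, bool) {
	if promptTokens <= 0 {
		return UsageEvent{}, false
	}
	return UsageEvent{
		Model:            model,
		PromptTokens:     promptTokens,
		CompletionTokens: 0, // embeddings never produce completion tokens
	}, true
}
```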


1636-1651: Embeddings endpoint limitation is intentional—nomic-embed-text is Tinfoil-only with no configured fallback provider.

Unlike chat completions which support primary + fallback cycling, nomic-embed-text is configured exclusively for the Tinfoil provider (line 143 in proxy_config.rs with fallbacks: vec![]). The proxy_embeddings function correctly only attempts the primary provider since no fallback is available. This design is intentional and requires no changes.
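
To make the routing point concrete: with a primary provider and an empty fallback list, a generic "primary, then fallbacks" loop degenerates to a single attempt. The sketch below illustrates that in Go; `Route`, `tryProviders`, and `errProviderDown` are hypothetical names and do not reproduce the structures in src/proxy_config.rs.

```go
package main

import "errors"

// errProviderDown is a hypothetical error used to simulate a failing provider.
var errProviderDown = errors.New("provider unavailable")

// Route is a hypothetical stand-in for a model's route configuration.
type Route struct {
	Primary   string
	Fallbacks []string
}

// tryProviders calls send for the primary provider and then for each fallback
// in order, stopping at the first success. With no fallbacks configured, only
// the primary is attempted and its error is returned as-is.
func tryProviders(route Route, send func(provider string) error) error {
	providers := append([]string{route.Primary}, route.Fallbacks...)
	var lastErr error
	for _, p := range providers {
		if lastErr = send(p); lastErr == nil {
			return nil
		}
	}
	return lastErr
}
```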

@AnthonyRonning AnthonyRonning merged commit e23d768 into master Jan 2, 2026
9 checks passed
@AnthonyRonning AnthonyRonning deleted the feat/embeddings-api branch January 2, 2026 15:32
2 participants